AITopics | Los Ríos Region

ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source

Le, Hung Tuan, To, Long Truong, Nguyen, Manh Trong, Van Nguyen, Kiet

arXiv.org Artificial Intelligence, May-13-2024

Fact-checking is essential due to the explosion of misinformation in the media ecosystem. Although false information exists in every language and country, most research on the problem has concentrated on large language communities such as English and Chinese. For low-resource languages like Vietnamese, corpora and models for fact verification still need to be explored. To bridge this gap, we construct ViWikiFC, the first manually annotated open-domain corpus for Vietnamese Wikipedia fact-checking, with more than 20K claims generated by converting evidence sentences extracted from Wikipedia articles. We analyze our corpus through several linguistic aspects: the new dependency rate, the new n-gram rate, and the new word rate. We conducted various experiments for Vietnamese fact-checking, including evidence retrieval and verdict prediction. BM25 and InfoXLM (Large) achieved the best results in the two tasks: in evidence retrieval, BM25 achieved an accuracy of 88.30% for SUPPORTS, 86.93% for REFUTES, and only 56.67% for the NEI label, while in verdict prediction InfoXLM (Large) achieved an F1 score of 86.51%. Furthermore, we also evaluated a pipeline approach, which achieved a strict accuracy of only 67.00% when using InfoXLM (Large) and BM25. These results demonstrate that our dataset is challenging for Vietnamese language models in fact-checking tasks.
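As a rough illustration of BM25-based evidence retrieval, the sketch below scores candidate sentences against a claim with the standard Okapi BM25 formula and returns the best match. It is not the authors' implementation: the function names, the whitespace tokenizer, the English toy sentences, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration, and real Vietnamese text would likely need proper word segmentation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    k1 and b are common default values, not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve_evidence(claim, sentences):
    """Return the index of the sentence with the highest BM25 score."""
    # Assumption: lowercase whitespace tokenization, for illustration only.
    tokenized = [s.lower().split() for s in sentences]
    scores = bm25_scores(claim.lower().split(), tokenized)
    return max(range(len(sentences)), key=scores.__getitem__)


# Hypothetical toy corpus standing in for Wikipedia evidence sentences.
sentences = [
    "Hanoi is the capital of Vietnam",
    "The Mekong river flows through several countries",
    "Pho is a Vietnamese noodle soup",
]
best = retrieve_evidence("The capital of Vietnam is Hanoi", sentences)
print(sentences[best])
```

In a pipeline setting, the retrieved sentence would then be passed, together with the claim, to a verdict classifier.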

computational linguistic, corpus, linguistic, (15 more...)

2405.07615

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.05)
(26 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Media > News (0.48)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(3 more...)

Causal Discovery under Off-Target Interventions

Choo, Davin, Shiragur, Kirankumar, Uhler, Caroline

arXiv.org Machine Learning, Feb-13-2024

Causal graph discovery is a significant problem with applications across various disciplines. However, with observational data alone, the underlying causal graph can only be recovered up to its Markov equivalence class, and further assumptions or interventions are necessary to narrow down the true graph. This work addresses the causal discovery problem under the setting of stochastic interventions with the natural goal of minimizing the number of interventions performed. We propose the following stochastic intervention model, which subsumes existing adaptive noiseless interventions in the literature while capturing scenarios such as fat-hand interventions and CRISPR gene knockouts: any intervention attempt results in an actual intervention on a random subset of vertices, drawn from a distribution that depends on the attempted action. Under this model, we study the two fundamental problems in causal discovery, verification and search, and provide approximation algorithms with polylogarithmic competitive ratios along with some preliminary experimental results.
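The stochastic intervention model described above can be pictured with a small simulation: each attempted action maps to a distribution over vertex subsets, and the subset actually intervened on is sampled from it. This is only a minimal sketch; the vertex names, the probabilities, and the `perform_intervention` helper are hypothetical and do not come from the paper.

```python
import random


def perform_intervention(attempt, off_target, rng):
    """Sample the vertex subset actually intervened on for an attempted action."""
    subsets, weights = zip(*off_target[attempt])
    return rng.choices(subsets, weights=weights, k=1)[0]


# Hypothetical distributions: attempting "A" may also hit "B" (fat-hand)
# or miss entirely; attempting "B" always hits exactly "B", which is the
# noiseless special case.
off_target = {
    "A": [
        (frozenset({"A"}), 0.7),
        (frozenset({"A", "B"}), 0.2),
        (frozenset(), 0.1),
    ],
    "B": [(frozenset({"B"}), 1.0)],
}

rng = random.Random(0)
actual = perform_intervention("A", off_target, rng)
print(sorted(actual))
```

A distribution that puts probability 1 on the attempted target, as for `"B"` here, recovers an ordinary noiseless intervention.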

algorithm, graph, intervention, (16 more...)

2402.08229

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Singapore (0.04)
South America > Chile > Los Ríos Region > Valdivia Province > Valdivia (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

RaViTT: Random Vision Transformer Tokens

Quezada, Felipe A., Navarro, Carlos F., Muñoz, Cristian, Zamorano, Manuel, Jara-Wilde, Jorge, Chang, Violeta, Navarro, Cristóbal A., Cerda, Mauricio

arXiv.org Artificial Intelligence, Jun-19-2023

Vision Transformers (ViTs) have successfully been applied to image classification problems where large annotated datasets are available. On the other hand, when fewer annotations are available, such as in biomedical applications, image augmentation techniques like introducing image variations or combinations have been proposed. However, regarding ViT patch sampling, less has been explored outside grid-based strategies. In this work, we propose Random Vision Transformer Tokens (RaViTT), a random patch sampling strategy that can be incorporated into existing ViTs. We experimentally evaluated RaViTT for image classification, comparing it with a baseline ViT and state-of-the-art (SOTA) augmentation techniques on 4 datasets, including ImageNet-1k and CIFAR-100. Results show that RaViTT increases the accuracy of the baseline on all datasets and outperforms the SOTA augmentation techniques on 3 out of 4 datasets by a significant margin of +1.23% to +4.32%. Interestingly, RaViTT accuracy improvements can be achieved even with fewer tokens, thus reducing the computational load of any ViT model for a given accuracy value.
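To make the contrast between grid-based and random patch sampling concrete, the sketch below generates patch positions both ways. The abstract does not specify how RaViTT draws its patches, so uniform sampling of top-left corners, the function names, and the 224x224 image with 16x16 patches are assumptions made purely for illustration.

```python
import random


def grid_corners(height, width, patch):
    """Top-left corners of the non-overlapping grid used by a standard ViT."""
    return [
        (y, x)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


def random_corners(height, width, patch, n_tokens, seed=0):
    """Top-left corners drawn uniformly at random; patches may overlap."""
    rng = random.Random(seed)
    return [
        (rng.randint(0, height - patch), rng.randint(0, width - patch))
        for _ in range(n_tokens)
    ]


def crop(image, corner, patch):
    """Cut one patch out of an image stored as a list of pixel rows."""
    y, x = corner
    return [row[x:x + patch] for row in image[y:y + patch]]


# Hypothetical 224x224 single-channel image and 16x16 patches.
image = [[0.0] * 224 for _ in range(224)]
grid = grid_corners(224, 224, 16)             # 14 * 14 grid positions
sampled = random_corners(224, 224, 16, 98)    # half as many tokens
patches = [crop(image, c, 16) for c in sampled]
print(len(grid), len(patches))
```

Because the number of sampled patches is a free parameter, the token count can be set below the grid size, which is the setting the abstract refers to when it mentions fewer tokens.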

artificial intelligence, machine learning, ravitt, (17 more...)

2306.10959

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
South America > Chile > Los Ríos Region > Valdivia Province > Valdivia (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Adaptivity Complexity for Causal Graph Discovery

Choo, Davin, Shiragur, Kirankumar

arXiv.org Artificial Intelligence, Jun-9-2023

Causal discovery from interventional data is an important problem, where the task is to design an interventional strategy that learns the hidden ground truth causal graph $G(V,E)$ on $|V| = n$ nodes while minimizing the number of performed interventions. Most prior interventional strategies broadly fall into two categories: non-adaptive and adaptive. Non-adaptive strategies decide on a single fixed set of interventions to be performed, while adaptive strategies can decide which nodes to intervene on sequentially based on past interventions. While adaptive algorithms may use exponentially fewer interventions than their non-adaptive counterparts, there are practical concerns that constrain the amount of adaptivity allowed. Motivated by this trade-off, we study the problem of $r$-adaptivity, where the algorithm designer recovers the causal graph under a total of $r$ sequential rounds whilst trying to minimize the total number of interventions. For this problem, we provide an $r$-adaptive algorithm that achieves an $O(\min\{r,\log n\} \cdot n^{1/\min\{r,\log n\}})$ approximation with respect to the verification number, a well-known lower bound for adaptive algorithms. Furthermore, for every $r$, we show that our approximation is tight. Our definition of $r$-adaptivity interpolates nicely between the non-adaptive ($r=1$) and fully adaptive ($r=n$) settings, where our approximation simplifies to $O(n)$ and $O(\log n)$ respectively, matching the best-known approximation guarantees for both extremes. Our results also extend naturally to bounded-size interventions.
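The two simplifications quoted in the abstract can be checked directly from the bound. The short derivation below assumes base-2 logarithms and $n \ge 2$ (so that $\log n \ge 1$); with another base the constant changes but not the asymptotics.

```latex
\begin{align*}
r = 1:&\quad \min\{1,\log n\} = 1
  \;\Rightarrow\; 1 \cdot n^{1/1} = n = O(n),\\
r = n:&\quad \min\{n,\log n\} = \log n
  \;\Rightarrow\; \log n \cdot n^{1/\log n}
  = \log n \cdot 2^{\log n/\log n} = 2\log n = O(\log n).
\end{align*}
```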

artificial intelligence, intervention, machine learning, (18 more...)

2306.05781

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Singapore (0.04)
South America > Chile > Los Ríos Region > Valdivia Province > Valdivia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Informative regularization for a multi-layer perceptron RR Lyrae classifier under data shift

Pérez-Galarce, Francisco, Pichara, Karim, Huijse, Pablo, Catelan, Márcio, Mery, Domingo

arXiv.org Artificial Intelligence, Mar-11-2023

In recent decades, machine learning has provided valuable models and algorithms for processing and extracting knowledge from time-series surveys. Different classifiers have been proposed and performed to an excellent standard. Nevertheless, few papers have tackled the data shift problem in labeled training sets, which occurs when there is a mismatch between the data distribution in the training set and the testing set. This drawback can damage the prediction performance in unseen data. Consequently, we propose a scalable and easily adaptable approach based on an informative regularization and an ad-hoc training procedure to mitigate the shift problem during the training of a multi-layer perceptron for RR Lyrae classification. We collect ranges for characteristic features to construct a symbolic representation of prior knowledge, which was used to model the informative regularizer component. Simultaneously, we design a two-step back-propagation algorithm to integrate this knowledge into the neural network, whereby one step is applied in each epoch to minimize classification error, while another is applied to ensure regularization. Our algorithm defines a subset of parameters (a mask) for each loss function. This approach handles the forgetting effect, which stems from a trade-off between these loss functions (learning from data versus learning expert knowledge) during training. Experiments were conducted using recently proposed shifted benchmark sets for RR Lyrae stars, outperforming baseline models by up to 3% through a more reliable classifier. Our method provides a new path to incorporate knowledge from characteristic features into artificial neural networks to manage the underlying data shift problem.

artificial intelligence, machine learning, regularization, (17 more...)

doi: 10.1016/j.ascom.2023.100694

2303.06544

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Chile > Los Ríos Region > Valdivia Province > Valdivia (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(4 more...)

Genre:

Overview (0.93)
Research Report > New Finding (0.92)

Informative Bayesian model selection for RR Lyrae star classifiers

Pérez-Galarce, F., Pichara, K., Huijse, P., Catelan, M., Mery, D.

arXiv.org Artificial Intelligence, May-24-2021

Machine learning has achieved an important role in the automatic classification of variable stars, and several classifiers have been proposed over the last decade. These classifiers have achieved impressive performance in several astronomical catalogues. However, some scientific articles have also shown that the training data therein contain multiple sources of bias. Hence, the performance of those classifiers on objects not belonging to the training data is uncertain, potentially resulting in the selection of incorrect models. It also gives rise to the deployment of misleading classifiers, for example through the creation of open-source labelled catalogues with biased predictions. In this paper, we develop a method based on an informative marginal likelihood to evaluate variable star classifiers. We collect deterministic rules that are based on physical descriptors of RR Lyrae stars, and then, to mitigate the biases, we introduce those rules into the marginal likelihood estimation. We perform experiments with a set of Bayesian Logistic Regressions, which are trained to classify RR Lyraes, and we find that our method outperforms traditional non-informative cross-validation strategies, even when penalized models are assessed. Our methodology provides a more rigorous alternative to assess machine learning models using astronomical knowledge. From this approach, applications to other classes of variable stars and algorithmic improvements can be developed.

classifier, informative bayesian model selection, rr lyrae star classifier, (11 more...)

doi: 10.1093/mnras/stab320

2105.11531

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Chile > Los Ríos Region > Valdivia Province > Valdivia (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (1.00)
(2 more...)
